{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "c9bea21a-2a4e-4fb0-9081-d6692a0f736a",
   "metadata": {},
   "source": [
    "## **Python `requests` 库详解**\n",
    "\n",
    "### **课程目标**\n",
    "\n",
    "- 学习如何使用 `requests` 库发送 HTTP 请求\n",
    "- 理解 GET 和 POST 请求的区别\n",
    "- 学习如何处理响应数据\n",
    "- 学会发送带参数的请求、文件上传和处理异常\n",
    "\n",
    "---\n",
    "\n",
    "### **1. `requests` 库简介**\n",
    "\n",
    "- `requests` 是一个用于简化 HTTP 请求的 Python 库，使发送网络请求更容易。\n",
    "\n",
    "- 安装：\n",
    "\n",
    "  ```bash\n",
    "  pip install requests\n",
    "  ```\n",
    "\n",
    "### **2. HTTP 协议简介**\n",
    "\n",
    "- HTTP 是一种用于通信的协议。常见的请求方法包括：\n",
    "  - **GET**：获取资源\n",
    "  - **POST**：向服务器提交数据\n",
    "  - **PUT**：更新资源\n",
    "  - **DELETE**：删除资源\n",
    "\n",
    "---\n",
    "\n",
    "### **3. GET 请求**\n",
    "\n",
    "GET 请求用于从服务器获取数据。`httpbin` 提供了测试 GET 请求的接口。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "90042d3a-a0a5-4276-a36c-4dc379b50212",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:51:59.270205Z",
     "start_time": "2024-09-08T14:51:58.119463Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 向 httpbin 发送 GET 请求\n",
    "response = requests.get('https://httpbin.org/get')\n",
    "\n",
    "# 打印状态码，状态码为 200 表示请求成功\n",
    "print(f'状态码: {response.status_code}')\n",
    "\n",
    "# 打印响应内容（文本格式）\n",
    "print(f'响应内容: {response.text}')\n",
    "\n",
    "# 如果服务器返回 JSON，可以使用 .json() 解析为字典\n",
    "data = response.json()\n",
    "print(f'解析后的 JSON 数据: {data}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "状态码: 200\n",
      "响应内容: {\n",
      "  \"args\": {}, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate, br, zstd\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"python-requests/2.32.2\", \n",
      "    \"X-Amzn-Trace-Id\": \"Root=1-66ddba0f-79b763ec262109c4685d1494\"\n",
      "  }, \n",
      "  \"origin\": \"221.15.159.55\", \n",
      "  \"url\": \"https://httpbin.org/get\"\n",
      "}\n",
      "\n",
      "解析后的 JSON 数据: {'args': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate, br, zstd', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.2', 'X-Amzn-Trace-Id': 'Root=1-66ddba0f-79b763ec262109c4685d1494'}, 'origin': '221.15.159.55', 'url': 'https://httpbin.org/get'}\n"
     ]
    }
   ],
   "execution_count": 1
  },
  {
   "cell_type": "markdown",
   "id": "4c9d0d9b-4cdc-46fa-ab4b-9566c4362f42",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. `requests.get()` 用于发送 GET 请求。\n",
    "2. `response.status_code` 是服务器返回的状态码，200 表示成功。\n",
    "3. `response.text` 是服务器返回的响应内容，通常是字符串。\n",
    "4. 如果响应内容是 JSON 格式，`response.json()` 可以将其解析为 Python 字典。\n",
    "\n",
    "---\n",
    "\n",
    "### **4. GET 请求带参数**\n",
    "\n",
    "有时候需要通过 URL 参数向服务器传递信息。这时我们可以使用 `params` 参数来传递查询参数。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "f62bddad-19aa-4fd8-a673-2ca10af228ee",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:00.850610Z",
     "start_time": "2024-09-08T14:51:59.293035Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 构建查询参数\n",
    "params = {\n",
    "    'name': 'Alice',\n",
    "    'age': 25\n",
    "}\n",
    "\n",
    "# 发送带参数的 GET 请求\n",
    "response = requests.get('https://httpbin.org/get', params=params)\n",
    "\n",
    "# 打印请求的 URL，包含了查询参数\n",
    "print(f'请求的 URL: {response.url}')\n",
    "\n",
    "# 打印返回的响应内容\n",
    "print(f'响应内容: {response.text}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "请求的 URL: https://httpbin.org/get?name=Alice&age=25\n",
      "响应内容: {\n",
      "  \"args\": {\n",
      "    \"age\": \"25\", \n",
      "    \"name\": \"Alice\"\n",
      "  }, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate, br, zstd\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"python-requests/2.32.2\", \n",
      "    \"X-Amzn-Trace-Id\": \"Root=1-66ddba10-4c914d7c362a38ce71865ff0\"\n",
      "  }, \n",
      "  \"origin\": \"221.15.159.55\", \n",
      "  \"url\": \"https://httpbin.org/get?name=Alice&age=25\"\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "execution_count": 2
  },
  {
   "cell_type": "markdown",
   "id": "5bc1beb8-8c93-4694-ba91-8b641c1c52b9",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. 使用 `params` 参数将查询字符串附加到 URL 中，例如：`?name=Alice&age=25`。\n",
    "2. `response.url` 可以查看包含查询参数的完整 URL。\n",
    "3. `httpbin.org/get` 会返回服务器接收到的所有参数，便于测试。\n",
    "\n",
    "---\n",
    "\n",
    "### **5. POST 请求**\n",
    "\n",
    "POST 请求用于向服务器提交数据。`httpbin` 提供了一个 `post` 接口用于测试。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "6ec4c9bc-67b3-4a91-998e-ca548c15bb6d",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:02.708626Z",
     "start_time": "2024-09-08T14:52:00.905127Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 要发送的数据\n",
    "data = {\n",
    "    'username': 'test_user',\n",
    "    'password': 'test_pass'\n",
    "}\n",
    "\n",
    "# 发送 POST 请求\n",
    "response = requests.post('https://httpbin.org/post', data=data)\n",
    "\n",
    "# 打印状态码\n",
    "print(f'状态码: {response.status_code}')\n",
    "\n",
    "# 打印返回的 JSON 响应\n",
    "print(f'响应内容: {response.json()}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "状态码: 200\n",
      "响应内容: {'args': {}, 'data': '', 'files': {}, 'form': {'password': 'test_pass', 'username': 'test_user'}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate, br, zstd', 'Content-Length': '37', 'Content-Type': 'application/x-www-form-urlencoded', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.2', 'X-Amzn-Trace-Id': 'Root=1-66ddba11-05b1754c1654443b14266f47'}, 'json': None, 'origin': '221.15.159.55', 'url': 'https://httpbin.org/post'}\n"
     ]
    }
   ],
   "execution_count": 3
  },
  {
   "cell_type": "markdown",
   "id": "086a7672-b91b-4ee6-8eee-5977e553a1a8",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. `requests.post()` 用于发送 POST 请求，提交的数据通过 `data` 参数传递。\n",
    "2. `httpbin.org/post` 会返回服务器接收到的 POST 数据，便于调试。\n",
    "3. 服务器返回的响应通常为 JSON，使用 `.json()` 方法解析。\n",
    "\n",
    "---\n",
    "\n",
    "### **6. 上传文件**\n",
    "\n",
    "可以使用 `requests` 发送文件数据，例如上传文件到服务器。\n",
    "\n",
    "#### 示例代码"
   ]
  },
  {
   "cell_type": "code",
   "id": "c54ea5d6-c5f1-4670-a6b8-342ae2b9390c",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:03.638921Z",
     "start_time": "2024-09-08T14:52:02.731923Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 要上传的文件\n",
    "files = {\n",
    "    'file': open('example.txt', 'rb')\n",
    "}\n",
    "\n",
    "# 发送文件上传请求\n",
    "response = requests.post('https://httpbin.org/post', files=files)\n",
    "\n",
    "# 打印返回的响应内容\n",
    "print(f'响应内容: {response.json()}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "响应内容: {'args': {}, 'data': '', 'files': {'file': '商务数据采集与分析'}, 'form': {}, 'headers': {'Accept': '*/*', 'Accept-Encoding': 'gzip, deflate, br, zstd', 'Content-Length': '174', 'Content-Type': 'multipart/form-data; boundary=1767426d18cd1c0419b0e904e127c12d', 'Host': 'httpbin.org', 'User-Agent': 'python-requests/2.32.2', 'X-Amzn-Trace-Id': 'Root=1-66ddba13-47a344753c9b0ef70e56f20a'}, 'json': None, 'origin': '221.15.159.55', 'url': 'https://httpbin.org/post'}\n"
     ]
    }
   ],
   "execution_count": 4
  },
  {
   "cell_type": "markdown",
   "id": "5d3f08fc-5509-41c1-aa5a-e5e1c081e6c1",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. `files` 参数用于上传文件，文件需要以二进制模式打开（`rb`）。\n",
    "2. `httpbin.org/post` 会返回服务器接收到的文件信息。\n",
    "\n",
    "---\n",
    "\n",
    "### **7. 自定义请求头**\n",
    "\n",
    "有时需要在请求中添加自定义的 HTTP 头，比如 `User-Agent` 或 `Authorization`。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "c3d2c815-edd8-43bf-a3a0-a15f89779eef",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:04.624681Z",
     "start_time": "2024-09-08T14:52:03.664556Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 自定义 HTTP 头\n",
    "headers = {\n",
    "    'User-Agent': 'my-app/1.0'\n",
    "}\n",
    "\n",
    "# 发送带自定义头的 GET 请求\n",
    "response = requests.get('https://httpbin.org/get', headers=headers)\n",
    "\n",
    "# 打印返回的响应内容\n",
    "print(f'响应内容: {response.text}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "响应内容: {\n",
      "  \"args\": {}, \n",
      "  \"headers\": {\n",
      "    \"Accept\": \"*/*\", \n",
      "    \"Accept-Encoding\": \"gzip, deflate, br, zstd\", \n",
      "    \"Host\": \"httpbin.org\", \n",
      "    \"User-Agent\": \"my-app/1.0\", \n",
      "    \"X-Amzn-Trace-Id\": \"Root=1-66ddba14-528210b21d79308c7f92632b\"\n",
      "  }, \n",
      "  \"origin\": \"221.15.159.55\", \n",
      "  \"url\": \"https://httpbin.org/get\"\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "execution_count": 5
  },
  {
   "cell_type": "markdown",
   "id": "d951fcdc-bbbf-419a-aaa7-fb8666b3d3d3",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. `headers` 参数用于添加自定义 HTTP 头。\n",
    "2. 常见的头包括 `User-Agent`（标识客户端）和 `Authorization`（身份认证）。\n",
    "\n",
    "---\n",
    "\n",
    "### **8. 处理异常**\n",
    "\n",
    "在网络请求中，可能会出现连接失败、超时等问题。可以通过捕获异常进行处理。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "6f49ab50-ab75-4415-9bfe-88ea019119a8",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:10.331080Z",
     "start_time": "2024-09-08T14:52:04.638652Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "try:\n",
    "    # 设置超时时间为 5 秒\n",
    "    response = requests.get('https://httpbin.org/delay/10', timeout=5)\n",
    "    print(f'状态码: {response.status_code}')\n",
    "except requests.Timeout:\n",
    "    print('请求超时')\n",
    "except requests.RequestException as e:\n",
    "    print(f'请求失败: {e}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "请求超时\n"
     ]
    }
   ],
   "execution_count": 6
  },
  {
   "cell_type": "markdown",
   "id": "fe0133fa-eaeb-40ef-b738-a8e168b89627",
   "metadata": {},
   "source": [
    "#### 讲解：\n",
    "\n",
    "1. `timeout` 参数用于设置请求超时时间。\n",
    "2. 使用 `try-except` 块捕获异常，例如 `requests.Timeout` 处理超时异常，`requests.RequestException` 处理所有请求相关的异常。\n",
    "\n",
    "---\n",
    "\n",
    "### **9. 会话与 Cookie**\n",
    "\n",
    "`requests.Session` 对象允许你跨多个请求保持某些参数（例如 Cookie）。\n",
    "\n",
    "#### 示例代码："
   ]
  },
  {
   "cell_type": "code",
   "id": "e0f85c00-9908-46b4-a45a-41f1fcd9f748",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:13.381006Z",
     "start_time": "2024-09-08T14:52:10.354672Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 创建一个会话对象\n",
    "session = requests.Session()\n",
    "\n",
    "# 发送请求，设置 Cookie\n",
    "session.get('https://httpbin.org/cookies/set/sessioncookie/123456789')\n",
    "\n",
    "# 发送另一个请求，查看 Cookie\n",
    "response = session.get('https://httpbin.org/cookies')\n",
    "\n",
    "# 打印返回的响应内容，查看 Cookie 是否被保留\n",
    "print(f'响应内容: {response.text}')\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "响应内容: {\n",
      "  \"cookies\": {\n",
      "    \"sessioncookie\": \"123456789\"\n",
      "  }\n",
      "}\n",
      "\n"
     ]
    }
   ],
   "execution_count": 7
  },
  {
   "cell_type": "markdown",
   "id": "e95578db-dd24-4073-bee4-93a709bbf503",
   "metadata": {},
   "source": [
    " `requests` 库get请求实例"
   ]
  },
  {
   "cell_type": "code",
   "id": "3e0b1841-b296-4d44-8dbe-7f40be32a8a2",
   "metadata": {
    "ExecuteTime": {
     "end_time": "2024-09-08T14:52:13.774850Z",
     "start_time": "2024-09-08T14:52:13.471878Z"
    }
   },
   "source": [
    "import requests\n",
    "\n",
    "# 设置请求的 URL\n",
    "url = \"https://movie.douban.com/top250\"\n",
    "\n",
    "# 定义请求头，模拟浏览器\n",
    "headers = {\n",
    "    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36'\n",
    "}\n",
    "\n",
    "# 发送 GET 请求\n",
    "response = requests.get(url, headers=headers)\n",
    "\n",
    "# 检查请求是否成功\n",
    "if response.status_code == 200:\n",
    "    # 将网页内容保存到本地文件\n",
    "    with open(\"douban_top250.html\", \"w\", encoding='utf-8') as file:\n",
    "        file.write(response.text)\n",
    "    print(\"网页保存成功!\")\n",
    "else:\n",
    "    print(f\"请求失败，状态码: {response.status_code}\")\n"
   ],
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "网页保存成功!\n"
     ]
    }
   ],
   "execution_count": 8
  },
  {
   "cell_type": "markdown",
   "id": "6da44f7b-3776-4160-8889-3f2bccbcd3be",
   "metadata": {},
   "source": [
    "### 代码解释：\n",
    "\n",
    "1. **请求页面**：使用 `requests.get` 向目标 URL 发送 HTTP GET 请求。我们通过设置 `headers` 来模仿浏览器请求，避免被服务器拒绝（有些服务器会限制非浏览器请求）。\n",
    "\n",
    "2. **检查响应状态码**：通过 `response.status_code` 检查请求是否成功，通常状态码为 200 表示成功。\n",
    "\n",
    "3. **保存网页内容**：如果请求成功，我们将通过 `response.text` 获取网页的 HTML 内容，并将其保存到本地文件 `douban_top250.html` 中。使用 `open()` 函数创建文件，并通过 `file.write()` 将内容写入文件。\n",
    "\n",
    "4. **处理错误**：如果请求失败（例如网络问题或服务器错误），程序会打印状态码，方便调试。"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "289eac5c-6f29-4c43-a913-086d8021e119",
   "metadata": {},
   "source": [
    "http://localhost:8000/login.html 作为我们服务器网站"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7130425f-51f1-42ee-a7dc-7f1c3ed45509",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "7b65f948-13c4-4865-b554-f19e81a14183",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "登录失败，状态码: 501\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "\n",
    "# 创建会话对象，保持会话\n",
    "session = requests.Session()\n",
    "\n",
    "# 定义登录页面和登录后页面的URL\n",
    "login_url = 'http://127.0.0.1:5000/login'\n",
    "home_url = 'http://127.0.0.1:5000/'\n",
    "\n",
    "# 定义登录表单数据\n",
    "login_data = {\n",
    "    'username': 'zcfe',  # 用户名\n",
    "    'password': '123456'  # 密码\n",
    "}\n",
    "\n",
    "# 第一步：获取登录页面\n",
    "response = session.get(home_url)\n",
    "\n",
    "# 检查登录页面请求是否成功\n",
    "if response.status_code == 200:\n",
    "    print(\"成功访问登录页面\")\n",
    "\n",
    "    # 第二步：发送 POST 请求，提交登录表单\n",
    "    login_response = session.post(login_url, data=login_data)\n",
    "\n",
    "    # 检查是否登录成功\n",
    "    if login_response.status_code == 200:\n",
    "        print(\"登录成功\")\n",
    "\n",
    "        # 第三步：保存登录后的页面\n",
    "        with open('logged_in_page.html', 'w', encoding='utf-8') as file:\n",
    "            file.write(login_response.text)\n",
    "\n",
    "        print(\"登录后的页面已保存为 'logged_in_page.html'\")\n",
    "    else:\n",
    "        print(f\"登录失败，状态码: {login_response.status_code}\")\n",
    "else:\n",
    "    print(f\"无法访问登录页面，状态码: {response.status_code}\")\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "53bd0d46-36aa-4e0f-85e8-c4f386ee034b",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "38837b93-d00f-4c97-a801-ee8344cce246",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "28726fd2-ba38-45a6-b8ab-c0ca4ddb396d",
   "metadata": {},
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "612a7de0-2b51-4b7b-9451-6ac4efa3bad5",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
